Sorting in the Presence of Branch Prediction and Caches: Fast Sorting on Modern Computers

Authors

  • Paul Biggar
  • David Gregg
Abstract

Sorting is one of the most important and studied problems in computer science. Many good algorithms exist which offer various trade-offs in efficiency, simplicity and memory use. However, most of these algorithms were discovered decades ago, at a time when computer architectures were much simpler than today. Branch prediction and cache memories are two developments in computer architecture that have a particularly large impact on the performance of sorting algorithms. This report describes a study of the behaviour of sorting algorithms on branch predictors and caches. Our work on branch prediction is almost entirely new, and finds a number of important results. In particular, we show that insertion sort causes the fewest branch mispredictions of any comparison-based algorithm, that optimizations such as the choice of the pivot in quicksort can have a large impact on the predictability of branches, and that advanced two-level branch predictors are usually worse at predicting branches in sorting algorithms than simpler branch predictors. In many cases it is possible to draw links between classical theoretical analyses of algorithms and their branch prediction behaviour. The other main work described in this report is an analysis of the behaviour of sorting algorithms on modern caches. Over the last decade there has been considerable interest in optimizing sorting algorithms to reduce the number of cache misses. We experimentally study the cache performance of both classical sorting algorithms and a variety of cache-optimized algorithms proposed by LaMarca and Ladner. Our experiments cover a much wider range of algorithms than other work, including the O(N^2) sorts, radixsort and shellsort, all within a single framework. We discover a number of new results, particularly relating to the branch prediction behaviour of cache-optimized sorts. We also developed a number of other improvements to the algorithms, such as removing the need for a sentinel in classical heapsort.
Overall, we found that a cache-optimized radixsort was the fastest sort in our study; the absence of comparison branches means that the algorithm causes almost no branch mispredictions.
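
The closing claim about radixsort can be illustrated with a minimal sketch of a least-significant-digit radix sort. This is not the authors' cache-optimized implementation: the function name `lsd_radix_sort`, the 8-bit digit size, and the restriction to non-negative integers below 2^32 are assumptions made for illustration. The point it shows is structural: keys are never compared with one another, so the only branches are loop-control branches, which follow a regular pattern.

```python
def lsd_radix_sort(keys, key_bits=32, digit_bits=8):
    """Sort non-negative integers below 2**key_bits (illustrative sketch).

    Assumption: every key fits in key_bits bits; larger or negative keys
    are not handled.
    """
    radix = 1 << digit_bits
    mask = radix - 1
    src = list(keys)
    dst = [0] * len(src)

    for shift in range(0, key_bits, digit_bits):
        # Counting pass: histogram of the current digit. No key is
        # compared against another key here.
        counts = [0] * radix
        for k in src:
            counts[(k >> shift) & mask] += 1

        # Exclusive prefix sum turns the counts into starting offsets.
        total = 0
        for d in range(radix):
            counts[d], total = total, total + counts[d]

        # Permute pass: stable scatter of each key to its digit's bucket.
        for k in src:
            d = (k >> shift) & mask
            dst[counts[d]] = k
            counts[d] += 1

        # The freshly written buffer becomes the input of the next pass.
        src, dst = dst, src

    return src
```

For example, `lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66])` returns `[2, 24, 45, 66, 75, 90, 170, 802]`. Python is used here only to show the control flow; measuring real branch mispredictions would require a compiled implementation and hardware counters or a simulator.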

Similar resources

Quality Grading of Porcelain Ware Using Machine Vision

One of the stages of quality control in porcelain-producing factories is sorting, which is done by human eye. Machine vision is among the new methods for defect detection and sorting of different products. In this study, defect diagnosis and the resulting sorting of porcelain use a linear structured light pattern, triangulation techniques and the rules governing mirrors. Also, among the defec...

Optimal Placement of Phasor Measurement Units to Maintain Complete Observability Considering Maximum Reliability by Non-dominated Sorting Genetic Algorithm-II (NSGA-II)

Ever-increasing energy demand has led to geographic expansion of transmission lines and their complexity. In addition, higher reliability is expected in the transmission systems due to their vital role in power systems. It is very difficult to realize this goal by conventional monitoring and control methods. Thus, phasor measurement units (PMUs) are used to measure system parameters. Although in...

Optimal Placement and Sizing of Distributed Generation Via an Improved Nondominated Sorting Genetic Algorithm II

The use of distributed generation units in distribution networks has attracted the attention of network managers due to its great benefits. In this research, the location and determination of the capacity of distributed generation (DG) units for different purposes has been studied simultaneously. The multi-objective functions in the optimization model are reducing system line losses; reducing v...

Hardware-Aware Algorithms and Data Structures

Various computer hardware components are affecting the running time of algorithms in different proportions, or may have severe implications on the accuracy of algorithms. In this dissertation we propose algorithms and data structures that are efficient and robust with respect to different hardware factors. The hardware factors affecting the running time that we consider include branch mispredic...

A Non-dominated Sorting Ant Colony Optimization Algorithm Approach to the Bi-objective Multi-vehicle Allocation of Customers to Distribution Centers

Distribution centers (DCs) play an important role in maintaining the uninterrupted flow of goods and materials between the manufacturers and their customers. This paper proposes a mathematical model for the bi-objective capacitated multi-vehicle allocation of customers to distribution centers. An evolutionary algorithm named non-dominated sorting ant colony optimization (NSACO) is used as the optimi...


Publication date: 2005